With the widespread use of complex machine learning models in real-world applications, explaining model predictions has become essential. However, these models are typically black-box deep neural networks, explained post-hoc via methods with known faithfulness limitations. Generalized Additive Models (GAMs) are an inherently interpretable class of models that address this limitation by learning a non-linear shape function for each feature separately, followed by a linear model on top. These models, however, are typically difficult to train, require many parameters, and are hard to scale. We propose an entirely new subfamily of GAMs that utilizes a basis decomposition of the shape functions. A small number of basis functions are shared among all features and are learned jointly for a given task, so our models scale much better to large-scale data with high-dimensional features, especially when the features are sparse. We propose an architecture denoted as the Neural Basis Model (NBM), which uses a single neural network to learn these bases. On a variety of tabular and image datasets, we demonstrate that for interpretable machine learning, NBMs are state-of-the-art in accuracy, model size, and throughput, and can easily model all higher-order feature interactions. Source code is available at https://github.com/facebookresearch/nbm-pam
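To make the basis-decomposition idea concrete, here is a minimal toy sketch of an NBM-style predictor. It assumes fixed random Fourier features as a stand-in for the single shared neural network that the paper learns end-to-end; all names (`D`, `B`, `coef`, `nbm_predict`) are illustrative, not the paper's API.

```python
import numpy as np

rng = np.random.default_rng(0)

# A few basis functions are shared across ALL features; each feature only
# learns a small coefficient vector that combines them into its shape function.
D, B = 5, 8                      # number of features, number of shared bases
omega = rng.normal(size=B)       # basis parameters (would be MLP weights in an NBM)
phase = rng.normal(size=B)
coef = rng.normal(size=(D, B))   # per-feature weights combining the shared bases
bias = 0.0

def bases(x):
    # Shared basis functions evaluated on a column of scalar inputs: (N, B)
    return np.cos(np.outer(x, omega) + phase)

def nbm_predict(X):
    # f(x) = bias + sum_i f_i(x_i), where f_i(x_i) = bases(x_i) @ coef[i]
    out = np.full(X.shape[0], bias)
    for i in range(X.shape[1]):
        out += bases(X[:, i]) @ coef[i]
    return out

X = rng.normal(size=(3, D))
print(nbm_predict(X).shape)
```

Interpretability comes from the additive structure: each shape function `f_i` can be plotted against its feature in isolation, while only `B` basis functions (instead of `D` full networks) need to be learned.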
Generalized Additive Models (GAMs) have quickly become the leading choice for fully-interpretable machine learning. However, unlike uninterpretable methods such as DNNs, they lack expressive power and easy scalability, and are therefore not a feasible alternative for real-world tasks. We present a new class of GAMs that uses tensor rank decompositions of polynomials to learn powerful, {\em fully-interpretable} models. Our approach, titled Scalable Polynomial Additive Models (SPAM), is effortlessly scalable and models {\em all} higher-order feature interactions without a combinatorial parameter explosion. SPAM outperforms all current interpretable approaches and matches DNN/XGBoost performance on a series of real-world benchmarks with up to hundreds of thousands of features. We demonstrate through human subject evaluations that SPAMs are demonstrably more interpretable in practice, and are therefore an effortless replacement for DNNs when creating interpretable and high-performance systems suitable for large-scale machine learning. Source code is available at https://github.com/facebookresearch/nbm-pam
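A minimal sketch of the idea behind SPAM, assuming a degree-2 polynomial whose quadratic interaction tensor is given a rank-R decomposition, f(x) = b + w·x + Σ_r λ_r (u_r·x)², so that all pairwise interactions are covered with O(R·D) parameters instead of O(D²). The names below are illustrative, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

D, R = 10, 3                    # number of features, rank of the decomposition
b = 0.1                         # bias
w = rng.normal(size=D)          # first-order (purely additive) weights
lam = rng.normal(size=R)        # per-rank scale of the quadratic term
U = rng.normal(size=(R, D))     # rank-R factors u_r of the interaction tensor

def spam_predict(X):
    linear = X @ w              # first-order term w . x
    proj = X @ U.T              # (N, R): projections u_r . x
    quad = (proj ** 2) @ lam    # rank-R quadratic term: sum_r lam_r (u_r . x)^2
    return b + linear + quad

X = rng.normal(size=(4, D))
print(spam_predict(X).shape)
```

Expanding the square shows the model assigns an explicit coefficient Σ_r λ_r u_ri u_rj to every feature pair (i, j), which is what keeps it fully interpretable despite modeling all interactions.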
A visual counterfactual explanation replaces image regions in a query image with regions from a distractor image, such that the system's decision on the transformed image changes to the distractor class. In this work, we present a novel framework for computing visual counterfactual explanations based on two key ideas. First, we enforce that the replaced and replacing regions contain the same semantic part, resulting in more consistent explanations. Second, we use multiple distractor images in a computationally efficient way and obtain more discriminative explanations with fewer region replacements. Our approach is 27% more semantically consistent and faster than competing methods on three fine-grained image recognition datasets. We highlight the utility of our counterfactuals over existing works through machine teaching experiments, in which we teach humans to classify different bird species. We also complement our explanations with a vocabulary of parts and attributes that contributed to the system's decision. On this task, we again achieve state-of-the-art results when using our counterfactual explanations relative to existing works, reinforcing the importance of semantically consistent explanations. Source code is available at https://github.com/facebookresearch/visual-counterfactuals
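The region-replacement search can be sketched as a greedy loop: swap in the single distractor cell that most increases the distractor-class margin, until the decision flips. This toy version uses a linear "classifier" over per-cell features; a real system would use a CNN and enforce that swapped regions share a semantic part, and every name here is illustrative.

```python
import numpy as np

C, K = 4, 2                              # cells per image, number of classes
W = np.array([[1.0, 1.0, 1.0, 1.0],      # per-cell evidence for class 0
              [-1.0, -1.0, -1.0, -1.0]]) # per-cell evidence for class 1

def score(cells):
    return W @ cells                     # class scores for a cell vector

def margin(cells, target):
    s = score(cells)
    return s[target] - max(s[j] for j in range(K) if j != target)

def counterfactual(query, distractor, target):
    cells, edits = query.copy(), []
    while np.argmax(score(cells)) != target and len(edits) < C:
        # Replace the single not-yet-edited cell whose swap most
        # increases the target-class margin.
        best = max(
            (i for i in range(C) if i not in edits),
            key=lambda i: margin(np.where(np.arange(C) == i, distractor, cells), target),
        )
        cells[best] = distractor[best]
        edits.append(best)
    return cells, edits

query = np.ones(C)                       # classified as class 0
distractor = -np.ones(C)                 # classified as class 1
cells, edits = counterfactual(query, distractor, target=1)
print(edits)                             # the cells that had to be replaced
```

The returned `edits` list is the explanation: the (few) regions whose replacement flipped the decision.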
Domain generalization involves learning a classifier from heterogeneously-collected training sources so that it generalizes to data drawn from a similar unknown target domain, with applications in large-scale learning and personalized inference. In many settings, privacy concerns prohibit obtaining domain labels for the training data samples, and instead only aggregated collections of training points are available. Existing approaches that utilize domain labels to create domain-invariant feature representations are inapplicable in this setting, and an alternative approach to learn generalizable classifiers is required. In this paper, we propose a domain-adaptive approach to this problem, which operates in two steps: (a) we cluster the training data within a carefully chosen feature space to create pseudo-domains, and (b) using these pseudo-domains, we learn a domain-adaptive classifier that makes predictions using information about both the input and the pseudo-domain it belongs to. Our approach achieves state-of-the-art performance on a variety of domain generalization benchmarks without using domain labels. Furthermore, we provide novel theoretical guarantees on domain generalization using the cluster information. Our approach is amenable to ensemble-based methods and provides substantial gains even on large-scale benchmark datasets. Code can be found at: https://github.com/xavierohan/adaclust_domainbed
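The two-step recipe can be sketched as follows, assuming plain k-means on stand-in feature embeddings to form the pseudo-domains, and pseudo-domain conditioning via a one-hot vector concatenated to the classifier input. The clustering space and the downstream classifier are placeholders, not the paper's actual choices.

```python
import numpy as np

rng = np.random.default_rng(3)

def kmeans(X, k, iters=20):
    # Step (a): cluster feature vectors into k pseudo-domains.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

def with_pseudo_domain(X, labels, k):
    # Step (b): condition the classifier on the pseudo-domain by
    # concatenating a one-hot pseudo-domain id to each input.
    onehot = np.eye(k)[labels]
    return np.concatenate([X, onehot], axis=1)

X = rng.normal(size=(12, 4))            # stand-in feature embeddings
labels, centers = kmeans(X, k=3)
X_aug = with_pseudo_domain(X, labels, k=3)
print(X_aug.shape)
```

At test time, an unseen point would be assigned to its nearest cluster center before prediction, so no domain labels are ever needed.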
Self-supervised learning aims to learn representations from the data itself without explicit manual supervision. Existing efforts ignore a crucial aspect of self-supervised learning: the ability to scale to large amounts of data, since self-supervision requires no manual labels. In this work, we revisit this principle and scale two popular self-supervised approaches to 100 million images. We show that by scaling on various axes (including data size and problem 'hardness'), one can largely match or even exceed the performance of supervised pre-training on a variety of tasks such as object detection, surface normal estimation (3D), and visual navigation using reinforcement learning. Scaling these methods also provides many interesting insights into the limitations of current self-supervised techniques and evaluations. We conclude that current self-supervised methods are not 'hard' enough to take full advantage of large-scale data and do not seem to learn effective high-level semantic representations. We also introduce an extensive benchmark across 9 different datasets and tasks. We believe that such a benchmark, along with comparable evaluation settings, is necessary to make meaningful progress.
State-of-the-art visual perception models for a wide range of tasks rely on supervised pretraining. ImageNet classification is the de facto pretraining task for these models. Yet, ImageNet is now nearly ten years old and is by modern standards "small". Even so, relatively little is known about the behavior of pretraining with datasets that are multiple orders of magnitude larger. The reasons are obvious: such datasets are difficult to collect and annotate. In this paper, we present a unique study of transfer learning with large convolutional networks trained to predict hashtags on billions of social media images. Our experiments demonstrate that training for large-scale hashtag prediction leads to excellent results. We show improvements on several image classification and object detection tasks, and report the highest ImageNet-1k single-crop, top-1 accuracy to date: 85.4% (97.6% top-5). We also perform extensive experiments that provide novel empirical data on the relationship between large-scale pretraining and transfer learning performance.

Table 1: Summary of image classification datasets. Each dataset is named with a template, role-source-I-L, that indicates its role (training, validation, testing), source, number of images I, and number of labels L.

Name template          Description
train-IG-I-1.5k        Instagram training set of I images and ∼1.5k hashtags from ImageNet-1k.
train-IG-I-8.5k        Instagram training set of I images and ∼8.5k hashtags from WordNet.
train-IG-I-17k         Instagram training set of I images and ∼17k hashtags from WordNet.
train-IN-1M-1k         The standard ImageNet-1k ILSVRC training set with 1.28M images.
val-IN-50k-1k          The standard ImageNet-1k ILSVRC validation set with 50k images.
train-IN-I-L           Extended ImageNet training set of I images and L ∈ {5k, 9k} labels.
val-IN-I-L             Extended ImageNet validation set of I images and L ∈ {5k, 9k} labels.
train-CUB-6k-200       The Caltech-UCSD Birds-200-2011 training set.
val-CUB-6k-200         The Caltech-UCSD Birds-200-2011 validation set.
train-Places-1.8M-365  The Places365-Standard training set (high-resolution version).
val-Places-37k-365     The Places365-Standard validation set (high-resolution version).
train-COCO-135k-80     The standard COCO detection training set (2017 version).
val-COCO-5k-80         The standard COCO detection validation set (2017 version).
test-COCO-20k-80       The standard COCO detection test-dev set (2017 version).
Reinforcement learning can enable robots to navigate to distant goals while optimizing user-specified reward functions, including preferences for following lanes, staying on paved paths, or avoiding freshly mowed grass. However, online learning from trial-and-error for real-world robots is logistically challenging, and methods that instead can utilize existing datasets of robotic navigation data could be significantly more scalable and enable broader generalization. In this paper, we present ReViND, the first offline RL system for robotic navigation that can leverage previously collected data to optimize user-specified reward functions in the real-world. We evaluate our system for off-road navigation without any additional data collection or fine-tuning, and show that it can navigate to distant goals using only offline training from this dataset, and exhibit behaviors that qualitatively differ based on the user-specified reward function.
The availability of challenging benchmarks has played a key role in the recent progress of machine learning. In cooperative multi-agent reinforcement learning, the StarCraft Multi-Agent Challenge (SMAC) has become a popular testbed for centralised training with decentralised execution. However, after years of sustained improvement on SMAC, algorithms now achieve near-perfect performance. In this work, we conduct new analysis demonstrating that SMAC is not sufficiently stochastic to require complex closed-loop policies. In particular, we show that an open-loop policy conditioned only on the timestep can achieve non-trivial win rates for many SMAC scenarios. To address this limitation, we introduce SMACv2, a new version of the benchmark where scenarios are procedurally generated and require agents to generalise to previously unseen settings (from the same distribution) during evaluation. We show that these changes ensure the benchmark requires the use of closed-loop policies. We evaluate state-of-the-art algorithms on SMACv2 and show that it presents significant challenges not present in the original benchmark. Our analysis illustrates that SMACv2 addresses the discovered deficiencies of SMAC and can help benchmark the next generation of MARL methods. Videos of training are available at https://sites.google.com/view/smacv2
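The open-loop probe described above can be illustrated with a toy environment: a policy that maps only the timestep to an action (a lookup table, ignoring observations) matches a closed-loop policy whenever the scenario unfolds deterministically, but fails once the start is randomized. The "environment" below is entirely illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

T = 5  # episode length

def env_step(state, action):
    # Reward 1 if the action matches the hidden state; state cycles 0,1,2.
    return int(action == state), (state + 1) % 3

def rollout(policy, start):
    state, total = start, 0
    for t in range(T):
        r, state = env_step(state, policy(t, state))
        total += r
    return total

open_loop = lambda t, obs: t % 3   # conditioned only on the timestep
closed_loop = lambda t, obs: obs   # reacts to the observation

# Deterministic start: the memorized open-loop schedule is already optimal.
print(rollout(open_loop, start=0), rollout(closed_loop, start=0))

# Randomized start (as in SMACv2's procedural generation): only the
# closed-loop policy stays optimal.
s = int(rng.integers(3))
print(rollout(open_loop, start=s), rollout(closed_loop, start=s))
```

This is the essence of the diagnosis: if a timestep-only policy scores well, the benchmark's scenarios are not stochastic enough to test closed-loop control.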
We propose an approach for semantic imitation, which uses demonstrations from a source domain, e.g. human videos, to accelerate reinforcement learning (RL) in a different target domain, e.g. a robotic manipulator in a simulated kitchen. Instead of imitating low-level actions like joint velocities, our approach imitates the sequence of demonstrated semantic skills like "opening the microwave" or "turning on the stove". This allows us to transfer demonstrations across environments (e.g. real-world to simulated kitchen) and agent embodiments (e.g. bimanual human demonstration to robotic arm). We evaluate on three challenging cross-domain learning problems and match the performance of demonstration-accelerated RL approaches that require in-domain demonstrations. In a simulated kitchen environment, our approach learns long-horizon robot manipulation tasks, using less than 3 minutes of human video demonstrations from a real-world kitchen. This enables scaling robot learning via the reuse of demonstrations, e.g. collected as human videos, for learning in any number of target domains.
Navigation is one of the most heavily studied problems in robotics, and is conventionally approached as a geometric mapping and planning problem. However, real-world navigation presents a complex set of physical challenges that defies simple geometric abstractions. Machine learning offers a promising way to go beyond geometry and conventional planning, allowing for navigational systems that make decisions based on actual prior experience. Such systems can reason about traversability in ways that go beyond geometry, accounting for the physical outcomes of their actions and exploiting patterns in real-world environments. They can also improve as more data is collected, potentially providing a powerful network effect. In this article, we present a general toolkit for experiential learning of robotic navigation skills that unifies several recent approaches, describe the underlying design principles, summarize experimental results from several of our recent papers, and discuss open problems and directions for future work.